

Section: New Results

Link-Heterogeneous Work Stealing for Branch-and-Bound Algorithms

Participants: T-T Vu, Bilel Derbel.

In this work [41], we advance the design of parallel and distributed optimization algorithms running on link-heterogeneous systems, where network latencies can strongly affect performance. We consider parallel Branch-and-Bound (B&B), viewed as a generic algorithm that searches a dynamically built tree representing a set of candidate solutions. A major challenge is to cope with the irregularity of B&B computations and to distribute the workload evenly at runtime. In this context, the random work-stealing paradigm has proved extremely beneficial. However, it is known to perform poorly in non-homogeneous distributed systems, where communication costs are a major obstacle to high performance. We therefore investigate the design of an effective work-stealing protocol that deals with the heterogeneity of network link latencies. We propose a generic distributed algorithm that can easily be implemented to fit different types of heterogeneity. The proposed algorithm extends reference approaches, namely Probabilistic Work Stealing (PWS) and Adaptive Cluster-aware Random Stealing (ACRS), by introducing new adaptive control operations that are shown to be highly effective at increasing work locality and decreasing steal costs. Through emulations on top of a real test-bed, we provide a comprehensive experimental analysis including: (i) a comparative study over a broad range of harsh network scenarios, going from flat networks to more hierarchical grid-like networks, and (ii) an in-depth analysis of the protocols' behavior, with the aim of gaining new insights into dynamic load balancing in heterogeneous distributed environments. Over all tested configurations, our results show that, although the proposed protocol is not tailored to a specific networked platform, it can save 30% of execution time on average compared to its competitors, while demonstrating high-quality self-adjusting capabilities.
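
To make the idea of latency-aware stealing concrete, the following is a minimal sketch of biased victim selection, in which an idle node picks a steal victim with probability inversely proportional to the link latency. It is only an illustration of the general principle, not the protocol proposed in [41] nor a faithful implementation of PWS or ACRS. The names `victim_probabilities`, `choose_victim`, and `LATENCIES_MS`, as well as the latency values and the inverse-latency weighting rule, are assumptions made for this example.

```python
import random

# Hypothetical one-way link latencies in milliseconds (assumed values).
# LATENCIES_MS[x][y] is the latency from node x to node y.
LATENCIES_MS = {
    "a": {"b": 1.0, "c": 10.0},
    "b": {"a": 1.0, "c": 10.0},
    "c": {"a": 10.0, "b": 10.0},
}


def victim_probabilities(latencies, thief):
    """Return {peer: probability}, with weight proportional to 1 / latency.

    Assumption for this sketch: closer peers (lower latency) are preferred,
    but every peer keeps a non-zero chance of being selected.
    """
    inverse = {
        peer: 1.0 / latency
        for peer, latency in latencies[thief].items()
        if peer != thief
    }
    total = sum(inverse.values())
    return {peer: weight / total for peer, weight in inverse.items()}


def choose_victim(latencies, thief, rng=random):
    """Draw one steal victim for `thief` according to the biased distribution."""
    probs = victim_probabilities(latencies, thief)
    peers = list(probs)
    weights = [probs[peer] for peer in peers]
    return rng.choices(peers, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs draw the same victims
    print(victim_probabilities(LATENCIES_MS, "a"))
    print([choose_victim(LATENCIES_MS, "a", rng) for _ in range(5)])
```

With a static weighting such as this one, node `a` mostly steals from its nearby peer `b` and only occasionally pays the higher cost of contacting `c`. An adaptive protocol would additionally update its behavior at runtime, for instance based on the outcome of past steal attempts; that part is deliberately left out of this sketch.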